13 research outputs found

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space built from the genomic data of Plasmodium falciparum, but also using the millions of genomic data available from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    euHCVdb: the European hepatitis C virus database

    The hepatitis C virus (HCV) genome shows remarkable sequence variability, leading to the classification of at least six major genotypes, numerous subtypes and a myriad of quasispecies within a given host. A database allowing researchers to investigate the genetic and structural variability of all available HCV sequences is an essential tool for studies on the molecular virology and pathogenesis of hepatitis C as well as drug design and vaccine development. We describe here the European Hepatitis C Virus Database (euHCVdb), a collection of computer-annotated sequences based on reference genomes. The annotations include genome mapping of sequences, use of recommended nomenclature, subtyping as well as three-dimensional (3D) molecular models of proteins. A WWW interface has been developed to facilitate database searches and the export of data for sequence and structure analyses. As part of an international collaborative effort with the US and Japanese databases, the European HCV Database (euHCVdb) is mainly dedicated to HCV protein sequences, 3D structures and functional analyses.

    Potential and limits of in silico target discovery-Case study of the search for new antimalarial chemotherapeutic targets.

    In medical sciences, a target is a broad concept used to describe a biological entity and/or a biological phenomenon on which one aims to act as part of a therapy. It follows that a target can be defined as a phenotype, a biological process, a subcellular organelle, a protein or a protein domain. It also follows that a target cannot be defined independently of the type of intervention one considers implementing. In this brief review, we describe how in silico organization of genomic and post-genomic information on all partners involved in malaria (human patient, Plasmodium parasite and Anopheles vector), consistent with knowledge of the disease in etiologic terms, appears to be an efficient source of information for both selecting and discarding target candidates. Some limitations in our capacity to explore the stored biological information, due to the current quality of genomic annotation, the level of database integration, or the performance of existing analytic and mining tools, are discussed. In silico strategies to assess the feasibility of bringing a target to a therapeutic development pipeline, in terms of target "druggability", are introduced.

    TULIP software and web server : automatic classification of protein sequences based on pairwise comparisons and Z-value statistics

    A configuration space of homologous protein sequences (or CSHP) has been recently constructed based on pairwise comparisons, with probabilities deduced from Z-value statistics (Monte Carlo methods applied to pairwise comparisons) and following evolutionary assumptions. A Z-value cut-off is applied so that proteins are placed in the CSHP only when the similarity of pairs of sequences is significant according to the Theorem of the Upper Limit of a score Probability (TULIP theorem). Based on the positions of similar protein sequences in the CSHP, a classification can be deduced, which can be visualized as trees, called TULIP trees. In previous case studies, TULIP trees were shown to be consistent with phylogenetic trees. To date, no tool has been made available to allow the computation of TULIP trees following this model. The availability of methods to cluster proteins based on pairwise comparisons and following evolutionary assumptions should be useful for evaluation and for the future improvements they might inspire. We developed a web server allowing the local or online computation of TULIP trees based on the CSHP probabilities. The input is a set of homologous protein sequences in multi-FASTA format. Pairwise comparisons are conducted using the Smith-Waterman method, with 100-1,000 sequence shufflings to estimate pairwise Z-values. The resulting Z-value matrix is used to infer a tree, which is then written to a file. The output therefore consists of a Z-value matrix, a distance matrix, a TULIP treefile in NEWICK format, and a TULIP tree visualisation. The TULIP server provides an easy-to-use interface to the TULIP software, and allows a classification of protein sequences based on pairwise alignments and following evolutionary assumptions. TULIP trees are consistent with phylogenies in numerous cases, but they can be inconsistent for multi-domain proteins in which some domains have been conserved in all branches. Thus TULIP trees cannot be considered as conventional phylogenetic trees, following the MIAPA (Minimum Information About a Phylogenetic Analysis) recommendations. A major strength of the TULIP classification is its statistical validity when analysing samples including compositionally unbiased and biased sequences (i.e. with biased amino acid distributions), like sequences from Plasmodium falciparum. The TULIP web server is a service of the Malaria Portal of the University of Pretoria, South Africa, and is available at http://malport.bi.up.ac.za/TULIP
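The shuffling step described in this abstract can be illustrated with a minimal sketch. It is not the TULIP implementation: the scoring scheme (match/mismatch/linear gap instead of a substitution matrix), the function names `smith_waterman_score` and `z_value`, and the choice to shuffle only the second sequence are all assumptions made for illustration. The Z-value is taken as the observed local alignment score minus the mean score over shuffled sequences, divided by their standard deviation.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (toy scoring, an assumption)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def z_value(a, b, n_shuffles=100, seed=0):
    """Monte Carlo Z-value: (observed - mean of shuffled scores) / their std deviation."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = smith_waterman_score(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps the composition of b, destroys its order
        shuffled_scores.append(smith_waterman_score(a, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        # Degenerate case: every shuffle scored the same.
        return float("inf") if observed > mean else 0.0
    return (observed - mean) / sd


if __name__ == "__main__":
    query = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    print(z_value(query, query))                   # identical pair: large Z-value
    print(z_value(query, "WWPPCCWWPPCCWWPPCCWW"))  # no shared residues: Z-value of 0.0
```

Because shuffling preserves amino acid composition, the Z-value measures how much of the score comes from residue order rather than from composition alone, which is why the approach suits compositionally biased sequences.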

    How Fitch-Margoliash Algorithm can Benefit from Multi Dimensional Scaling.

    Whatever the phylogenetic method, genetic sequences are often described as strings of characters; molecular sequences can thus be viewed as elements of a multi-dimensional space. As a consequence, studying motion in this space (i.e., the evolutionary process) must deal with the surprising features of high-dimensional spaces, such as the concentration of measure phenomenon. To study how these features might influence phylogeny reconstructions, we examined a particularly popular method: the Fitch-Margoliash algorithm, which belongs to the Least Squares methods. We show that the Least Squares methods are closely related to Multi Dimensional Scaling. Indeed, the criteria for Fitch-Margoliash and Sammon's mapping are somewhat similar. However, the prolific research in Multi Dimensional Scaling has since produced methods that clearly outperform Sammon's mapping. Least Squares methods for tree reconstruction can now take advantage of these improvements. However, "false neighborhoods" and "tears" are the two main risks in the dimensionality reduction field: a "false neighborhood" corresponds to widely separated data in the original space that are found close together in the representation space, and neighboring data that are displayed in remote positions constitute a "tear". To address this problem, we took advantage of the concepts of "continuity" and "trustworthiness" in the tree reconstruction field, which limit the risk of "false neighborhoods" and "tears". We also point out the concentration of measure phenomenon as a source of error and introduce here new criteria to build phylogenies with improved preservation of distances and robustness. The authors and the Evolutionary Bioinformatics Journal dedicate this article to the memory of Professor W.M. Fitch (1929-2011)
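The similarity between the two criteria mentioned in this abstract can be seen by writing both out. The sketch below uses the textbook forms, which may differ in detail from the variants studied in the paper: the Fitch-Margoliash criterion weights each squared error by 1/d², while Sammon's stress weights it by 1/d and normalizes by the sum of the original distances. The function names and the toy matrices are assumptions for illustration; `d` holds the observed pairwise distances and `p` the distances reproduced by the tree (path lengths) or by the low-dimensional map.

```python
from itertools import combinations


def fitch_margoliash_criterion(d, p):
    """Weighted least squares: sum over pairs i<j of ((d_ij - p_ij) / d_ij)^2."""
    n = len(d)
    return sum(
        ((d[i][j] - p[i][j]) / d[i][j]) ** 2
        for i, j in combinations(range(n), 2)
    )


def sammon_stress(d, p):
    """Sammon's stress: (1 / sum d_ij) * sum over pairs i<j of (d_ij - p_ij)^2 / d_ij."""
    n = len(d)
    pairs = list(combinations(range(n), 2))
    total = sum(d[i][j] for i, j in pairs)
    return sum((d[i][j] - p[i][j]) ** 2 / d[i][j] for i, j in pairs) / total


if __name__ == "__main__":
    # Toy example (hypothetical values): one pair is reproduced with an error of 1.
    observed = [[0, 2, 4], [2, 0, 4], [4, 4, 0]]
    fitted = [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
    print(fitch_margoliash_criterion(observed, fitted))
    print(sammon_stress(observed, fitted))
```

Both criteria penalize relative rather than absolute errors, so small distances are reproduced more faithfully than large ones. They differ only in the power of the weight and in the normalization.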

    Conservation, characterisation and management of grapevine genetic resources: the European GrapeGen06 project

    The genomic revolution has been the starting point for renewed attention to the extent of naturally occurring variation in many plant species. In this regard, the use of genetic diversity in grape may greatly improve our understanding of quantitative, economically important traits. Genetic resources for grape are very numerous, in particular in Europe, but the exact extent of diversity is still difficult to know. Within the scope of the European GrapeGen06 project, research institutes and grapevine collections from 17 countries of Europe, the Caucasus and North Africa are working together on the characterization and management of grapevine genetic resources, including wild accessions of Vitis vinifera subsp. sylvestris. The initial work consisted of inventorying and describing the varieties in the partners' collections, using standardized morphological, agronomic, and molecular descriptors, and linking the information into a single European web database. The project focuses in particular on old autochthonous varieties. The database website, numbering more than 25000 accessions, is open for consultation, with different levels of access that allow in-depth searches but also confidentiality, according to user needs. At the end of 2010, the fingerprint data will also be online, so that other collections or professionals can check their own varieties against the referenced varieties to verify names and true-to-typeness. The final objective of the European project is to promote an optimized conservation scheme for the Vitis germplasm, involving ex-situ, cryo- and on-farm conservation, so that the resources are permanently maintained, easily accessible and field-tested in a pertinent agricultural context. This network of resources will also provide plant material as a basis for biotechnology, genomic and breeding research.

    Phylogeny and Sequence Space: A Combined Approach to Analyze the Evolutionary Trajectories of Homologous Proteins. The Case Study of Aminodeoxychorismate Synthase

    During the course of evolution, variation of a protein sequence is an ongoing phenomenon, limited however by the need to maintain the protein's structural and functional integrity. Deciphering the evolutionary path of a protein is thus of fundamental interest. With the development of new methods to visualize high-dimensional spaces and the improvement of phylogenetic analysis tools, it is possible to study the evolutionary trajectories of proteins in the sequence space. Using the data-driven high-dimensional scaling method, we show that it is possible to predict and represent potential evolutionary trajectories by representing phylogenetic trees in a 3D projection of the sequence space. With the case of aminodeoxychorismate synthase, an enzyme involved in folate synthesis, we show that this representation raises interesting questions about the complexity of the evolution of a given biological function, in particular concerning its capacity to explore the sequence space.

    Toxoplasma gondii acyl-lipid metabolism: de novo synthesis from apicoplast-generated fatty acids versus scavenging of host cell precursors

    Toxoplasma gondii is an obligate intracellular parasite that contains a relic plastid, called the apicoplast, deriving from a secondary endosymbiosis with an ancestral alga. Metabolic labelling experiments using [(14)C]acetate led to a substantial production of numerous glycero- and sphingo-lipid classes in extracellular tachyzoites. Syntheses of all these lipids were affected by the herbicide haloxyfop, demonstrating that their de novo syntheses necessarily required a functional apicoplast fatty acid synthase II. The complex metabolic profiles obtained and a census of glycerolipid metabolism gene candidates indicate that synthesis is probably scattered in the apicoplast membranes [possibly for PA (phosphatidic acid), DGDG (digalactosyldiacylglycerol) and PG (phosphatidylglycerol)], the endoplasmic reticulum (for major phospholipid classes and ceramides) and mitochondria (for PA, PG and cardiolipid). Based on a bioinformatic analysis, it is proposed that apicoplast-produced acyl-ACP (where ACP is acyl-carrier protein) is transferred to glycerol-3-phosphate for apicoplast glycerolipid synthesis. Acyl-ACP is also probably transported outside the apicoplast stroma and irreversibly converted into acyl-CoA. In the endoplasmic reticulum, acyl-CoA may not be transferred to a three-carbon backbone by an enzyme similar to the cytosolic plant glycerol-3-phosphate acyltransferase, but rather by a dual glycerol-3-phosphate/dihydroxyacetone-3-phosphate acyltransferase, as in animal and yeast cells. We further showed that intracellular parasites could also synthesize most of their lipids from scavenged host cell precursors. The observed appearance of glycerolipids specific to either the de novo pathway in extracellular parasites (unknown glycerolipid 1 and the plant-like DGDG), or the intracellular stages (unknown glycerolipid 8), may explain the necessary coexistence of both de novo parasitic acyl-lipid synthesis and recycling of host cell compounds.

    Cannabis Use Is Inversely Associated with Overweight and Obesity in Hepatitis B Virus-Infected Patients (ANRS CO22 Hepather Cohort)

    Background: Chronic hepatitis B virus (HBV) infection may evolve into cirrhosis and hepatocellular carcinoma, and this progression may be accelerated by specific risk factors, including overweight and obesity. Although evidence for a protective effect of cannabis use on elevated body weight has been found for other populations, no data are available for HBV-infected patients. Aims: We aimed to identify risk factors (including cannabis use) for overweight and obesity in patients with HBV chronic infection. Methods: Using baseline data from the French ANRS CO22 Hepather cohort, we performed two separate analyses, one using "central obesity" (based on waist circumference) and the other "overweight" and "obesity" (based on body mass index) as outcomes. Logistic and multinomial regressions were used to model central obesity and overweight/obesity, respectively. Results: Among the 3706 patients in the study population, 50.8% had central obesity, 34.7% overweight, and 14.4% obesity. After multivariable adjustment, current cannabis use was associated with a 59% lower risk of central obesity compared with no lifetime use (adjusted odds ratio [95% CI]: 0.41 [0.24 to 0.70]). It was also associated with a 54% and 84% lower risk of overweight (adjusted relative risk ratio [95% CI]: 0.46 [0.27 to 0.76]) and obesity (0.16 [0.04 to 0.67]), respectively. Conclusions: Cannabis use was associated with lower risks of overweight and obesity in patients with HBV chronic infection. Future studies should test whether these potential benefits of cannabis and cannabinoid use translate into reduced liver disease progression in this high-risk population.
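The percentages quoted in the Results follow from the reported ratios by simple arithmetic: a ratio r below 1 corresponds to a reduction of (1 - r) × 100 percent. The short sketch below reproduces this conversion. The function name `percent_lower` is an assumption, and strictly speaking an odds ratio describes lower odds rather than lower risk, so the figures are best read as relative reductions on the scale of the ratio reported.

```python
def percent_lower(ratio):
    """Relative reduction, in percent, implied by a ratio below 1."""
    return round((1.0 - ratio) * 100)


if __name__ == "__main__":
    # Point estimates as reported in the abstract.
    reported = {
        "central obesity (adjusted odds ratio)": 0.41,
        "overweight (adjusted relative risk ratio)": 0.46,
        "obesity (adjusted relative risk ratio)": 0.16,
    }
    for outcome, ratio in reported.items():
        print(f"{outcome}: {percent_lower(ratio)}% lower")
```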